Context-dependent probabilistic hierarchical sublexical modelling using finite state transducers
Abstract
This paper describes a unified architecture for integrating sub-lexical models with speech recognition, and a layered framework for context-dependent probabilistic hierarchical sub-lexical modelling. Previous work [1, 2, 3] has demonstrated the effectiveness of sub-lexical modelling using a core context-free grammar (CFG) augmented with context-dependent probabilistic models. Our main motivation for designing a unified architecture is to provide a framework in which probabilistic sub-lexical components can be integrated with other speech recognition components without sacrificing the flexibility to develop and configure each component independently. At the same time, we obtain a tightly coupled interface between recognizers and sub-lexical linguistic components. We also present a view of using layered probabilistic models to augment CFGs. This view captures context-dependent probabilistic information beyond the standard CFG formalism and provides the flexibility to develop suitable probabilistic models independently for each sub-lexical layer. Experimental results show that the context-dependent probabilistic hierarchical sub-lexical modelling approach achieves performance comparable to pronunciation network approaches on utterances that contain only in-vocabulary words, while substantially reducing errors on utterances with previously unseen words.
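As an illustration only, and not the paper's implementation, the following minimal sketch shows the general idea of coupling two knowledge sources by composing weighted finite state transducers. It assumes epsilon-free transducers and additive costs (for example, negative log probabilities). The transducers `A` and `B`, their symbols, and their weights are hypothetical toy values: `A` maps surface phones to phonemes, and `B` assigns a cost to each phoneme that depends on the preceding phoneme class.

```python
from collections import defaultdict

def compose(a, b):
    """Compose two epsilon-free weighted transducers (costs add along a path).

    A transducer is a dict with 'start', 'finals' (state -> final cost) and
    'arcs' (state -> list of (input, output, cost, next_state)).
    """
    start = (a["start"], b["start"])
    arcs, finals = defaultdict(list), {}
    stack, seen = [start], {start}
    while stack:
        state = stack.pop()
        qa, qb = state
        if qa in a["finals"] and qb in b["finals"]:
            finals[state] = a["finals"][qa] + b["finals"][qb]
        for (i, mid, w1, na) in a["arcs"].get(qa, []):
            for (mid2, o, w2, nb) in b["arcs"].get(qb, []):
                if mid == mid2:  # output of `a` must match input of `b`
                    nxt = (na, nb)
                    arcs[state].append((i, o, w1 + w2, nxt))
                    if nxt not in seen:
                        seen.add(nxt)
                        stack.append(nxt)
    return {"start": start, "finals": finals, "arcs": dict(arcs)}

def best_path(t, inputs):
    """Return (cost, outputs) of the cheapest accepting path, or None."""
    frontier = {t["start"]: (0.0, [])}
    for sym in inputs:
        nxt = {}
        for state, (cost, outs) in frontier.items():
            for (i, o, w, dest) in t["arcs"].get(state, []):
                if i == sym:
                    cand = (cost + w, outs + [o])
                    if dest not in nxt or cand[0] < nxt[dest][0]:
                        nxt[dest] = cand
        frontier = nxt
    done = [(c + t["finals"][s], o) for s, (c, o) in frontier.items()
            if s in t["finals"]]
    return min(done, key=lambda x: x[0]) if done else None

# Hypothetical phone-to-phoneme layer: a flap "dx" may be /t/ or /d/.
A = {"start": 0, "finals": {0: 0.0},
     "arcs": {0: [("ae", "ae", 0.0, 0), ("dx", "t", 0.75, 0), ("dx", "d", 0.5, 0)]}}

# Hypothetical context-dependent layer: cost depends on the previous class.
_after_nonvowel = [("ae", "ae", 0.25, "V"), ("t", "t", 1.0, "C"), ("d", "d", 1.0, "C")]
B = {"start": "start", "finals": {"V": 0.0, "C": 0.0},
     "arcs": {"start": _after_nonvowel, "C": _after_nonvowel,
              "V": [("ae", "ae", 0.5, "V"), ("t", "t", 0.25, "C"), ("d", "d", 1.0, "C")]}}

print(best_path(A, ["dx"]))                    # context-free choice
print(best_path(compose(A, B), ["ae", "dx"]))  # choice with context
```

In this toy setup the first layer alone prefers /d/ for the flap, while the composed model prefers /t/ after a vowel, because the second layer's context-dependent cost outweighs the difference. Each layer is defined independently and the two are only combined at composition time.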
Similar resources
Sub-lexical Modelling Using a Finite State Transducer Framework
The finite state transducer (FST) approach [1] has been widely used recently as an effective and flexible framework for speech systems. In this framework, a speech recognizer is represented as the composition of a series of FSTs combining various knowledge sources across sub-lexical and high-level linguistic layers. In this paper, we use this FST framework to explore some sublexical modelling a...
Integration of supra-lexical linguistic models with speech recognition using shallow parsing and finite state transducers
This paper proposes a layered Finite State Transducer (FST) framework integrating hierarchical supra-lexical linguistic knowledge into speech recognition based on shallow parsing. The shallow parsing grammar is derived directly from the full fledged grammar for natural language understanding, and augmented with top-level n-gram probabilities and phrase-level context-dependent probabilities, whi...
Silence models in weighted finite-state transducers
We investigate the effects of different silence modelling strategies in Weighted Finite-State Transducers for Automatic Speech Recognition. We show that the choice of silence models, and the way they are included in the transducer, can have a significant effect on the size of the resulting transducer; we present a means to prevent particularly large silence overheads. Our conclusions include th...
A Unified Framework for Sublexical and Linguistic Modelling Supporting Flexible Vocabulary Speech Understanding
In [9], we introduced the ANGIE framework for modelling speech where morphological and phonological substructures of words are jointly characterized by a context-free grammar and represented in a multi-layered hierarchical structure. In [6], we demonstrated a competitive word-spotter based on the ANGIE framework and presented several results comparing the performance of various sublexical fille...
An efficient implementation of phonological rules using finite-state transducers
Context-dependent phonological rules are used to model the mapping from phonemes to their varied phonetic surface realizations. Others, most notably Kaplan and Kay, have described how to compile general context-dependent phonological rewrite rules into finite-state transducers. Such rules are very powerful, but their compilation is complex and can result in very large nondeterministic automata....
Publication date: 2001